update EdgeNGramTokenizer.DEFAULT_MAX_NGRAM_SIZE to be practical #13813

Closed
YeonghyeonKO wants to merge 2 commits into apache:main from YeonghyeonKO:patch-2

@YeonghyeonKO
Contributor

Issue: #13802

  • Many Lucene-based libraries (see the Elasticsearch and OpenSearch source code) use NGramTokenizer.DEFAULT_MAX_NGRAM_SIZE rather than EdgeNGramTokenizer's own default when configuring an EdgeNGramTokenizer.
  • For that reason, it is not practical to keep EdgeNGramTokenizer's DEFAULT_MAX_NGRAM_SIZE at 1.

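To illustrate why a maximum gram size of 1 is of limited use, here is a minimal sketch in plain Java. It does not call Lucene: `EdgeNGramSketch` and `edgeNGrams` are hypothetical names, and the helper only approximates edge n-gram behaviour by returning the prefixes of a string with lengths from `minGram` to `maxGram`. The value 2 is just an illustrative larger maximum.

```java
import java.util.ArrayList;
import java.util.List;

public class EdgeNGramSketch {

    // Hypothetical helper, not a Lucene API: returns the prefixes of `text`
    // whose lengths run from minGram up to maxGram (capped at the text length).
    static List<String> edgeNGrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, text.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(text.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // With a maximum of 1, only the first character is emitted.
        System.out.println(edgeNGrams("lucene", 1, 1)); // [l]
        // With a larger maximum, longer prefixes become available for matching.
        System.out.println(edgeNGrams("lucene", 1, 2)); // [l, lu]
    }
}
```

With the maximum at 1, a prefix search could only ever match on a single leading character, which is presumably why downstream projects substitute a larger default.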
@YeonghyeonKO marked this pull request as draft on September 20, 2024, 09:12